Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
                                            Some full text articles may not yet be available without a charge during the embargo (administrative interval).
                                        
                                        
                                        
                                            
                                                
                                             What is a DOI Number?
                                        
                                    
                                
Some links on this page may take you to non-federal websites. Their policies may differ from this site.
- 
            Free, publicly-accessible full text available March 31, 2026
- 
            Free, publicly-accessible full text available February 28, 2026
- 
            Abstract Single-cell RNA sequencing (scRNA-seq) enables dissecting cellular heterogeneity in tissues, resulting in numerous biological discoveries. Various computational methods have been devised to delineate cell types by clustering scRNA-seq data, where clusters are often annotated using prior knowledge of marker genes. In addition to identifying pure cell types, several methods have been developed to identify cells undergoing state transitions, which often rely on prior clustering results. The present computational approaches predominantly investigate the local and first-order structures of scRNA-seq data using graph representations, while scRNA-seq data frequently display complex high-dimensional structures. Here, we introduce scGeom, a tool that exploits the multiscale and multidimensional structures in scRNA-seq data by analyzing the geometry and topology through curvature and persistent homology of both cell and gene networks. We demonstrate the utility of these structural features to reflect biological properties and functions in several applications, where we show that curvatures and topological signatures of cell and gene networks can help indicate transition cells and the differentiation potential of cells. We also illustrate that structural characteristics can improve the classification of cell types.more » « less
- 
            LMNA-related dilated cardiomyopathy (DCM) is an autosomal-dominant genetic condition with cardiomyocyte and conduction system dysfunction often resulting in heart failure or sudden death. The condition is caused by mutation in the Lamin A/C (LMNA) gene encoding Type-A nuclear lamin proteins involved in nuclear integrity, epigenetic regulation of gene expression, and differentiation. The molecular mechanisms of the disease are not completely understood, and there are no definitive treatments to reverse progression or prevent mortality. We investigated possible mechanisms of LMNA-related DCM using induced pluripotent stem cells derived from a family with a heterozygous LMNA c.357-2A>G splice-site mutation. We differentiated one LMNA-mutant iPSC line derived from an affected female (Patient) and two non-mutant iPSC lines derived from her unaffected sister (Control) and conducted single-cell RNA sequencing for 12 samples (four from Patients and eight from Controls) across seven time points: Day 0, 2, 4, 9, 16, 19, and 30. Our bioinformatics workflow identified 125,554 cells in raw data and 110,521 (88%) high-quality cells in sequentially processed data. Unsupervised clustering, cell annotation, and trajectory inference found complex heterogeneity: ten main cell types; many possible subtypes; and lineage bifurcation for cardiac progenitors to cardiomyocytes (CMs) and epicardium-derived cells (EPDCs). Data integration and comparative analyses of Patient and Control cells found cell type and lineage-specific differentially expressed genes (DEGs) with enrichment, supporting pathway dysregulation. Top DEGs and enriched pathways included 10 ZNF genes and RNA polymerase II transcription in pluripotent cells (PP); BMP4 and TGF Beta/BMP signaling, sarcomere gene subsets and cardiogenesis, CDH2 and EMT in CMs; LMNA and epigenetic regulation, as well as DDIT4 and mTORC1 signaling in EPDCs. Top DEGs also included XIST and other X-linked genes, six imprinted genes (SNRPN, PWAR6, NDN, PEG10, MEG3, MEG8), and enriched gene sets related to metabolism, proliferation, and homeostasis. We confirmed Lamin A/C haploinsufficiency by allelic expression and Western blot. Our complex Patient-derived iPSC model for Lamin A/C haploinsufficiency in PP, CM, and EPDC provided support for dysregulation of genes and pathways, many previously associated with Lamin A/C defects, such as epigenetic gene expression, signaling, and differentiation. Our findings support disruption of epigenomic developmental programs, as proposed in other LMNA disease models. We recognized other factors influencing epigenetics and differentiation; thus, our approach needs improvement to further investigate this mechanism in an iPSC-derived model.more » « less
- 
            High-dimensional multimodal data arises in many scientific fields. The integration of multimodal data becomes challenging when there is no known correspondence between the samples and the features of different datasets. To tackle this challenge, we introduce AVIDA, a framework for simultaneously performing data alignment and dimension reduction. In the numerical experiments, Gromov-Wasserstein optimal transport and t-distributed stochastic neighbor embedding are used as the alignment and dimension reduction modules respectively. We show that AVIDA correctly aligns high-dimensional datasets without common features with four synthesized datasets and two real multimodal single-cell datasets. Compared to several existing methods, we demonstrate that AVIDA better preserves structures of individual datasets, especially distinct local structures in the joint low-dimensional visualization, while achieving comparable alignment performance. Such a property is important in multimodal single-cell data analysis as some biological processes are uniquely captured by one of the datasets. In general applications, other methods can be used for the alignment and dimension reduction modules.more » « less
- 
            Spatial transcriptomic technologies and spatially annotated single-cell RNA sequencing datasets provide unprecedented opportunities to dissect cell–cell communication (CCC). However, incorporation of the spatial information and complex biochemical processes required in the reconstruction of CCC remains a major challenge. Here, we present COMMOT (COMMunication analysis by Optimal Transport) to infer CCC in spatial transcriptomics, which accounts for the competition between different ligand and receptor species as well as spatial distances between cells. A collective optimal transport method is developed to handle complex molecular interactions and spatial constraints. Furthermore, we introduce downstream analysis tools to infer spatial signaling directionality and genes regulated by signaling using machine learning models. We apply COMMOT to simulation data and eight spatial datasets acquired with five different technologies to show its effectiveness and robustness in identifying spatial CCC in data with varying spatial resolutions and gene coverages. Finally, COMMOT identifies new CCCs during skin morphogenesis in a case study of human epidermal development.more » « less
 An official website of the United States government
An official website of the United States government 
				
			 
					 
					
 
                                     Full Text Available
                                                Full Text Available